 electricity usage


Accurate and Energy Efficient: Local Retrieval-Augmented Generation Models Outperform Commercial Large Language Models in Medical Tasks

arXiv.org Artificial Intelligence

Background: The increasing adoption of Artificial Intelligence (AI) in healthcare has sparked growing concerns about its environmental and ethical implications. Commercial Large Language Models (LLMs), such as ChatGPT and DeepSeek, require substantial resources, and their use for medical purposes raises critical issues regarding patient privacy and safety. Methods: We developed a customizable Retrieval-Augmented Generation (RAG) framework for medical tasks that monitors its own energy usage and CO2 emissions. This system was then used to create RAGs based on various open-source LLMs. The tested models included general-purpose models such as llama3.1:8b as well as the medical-domain-specific medgemma-4b-it. The performance and energy consumption of the best RAGs were compared with those of DeepSeekV3-R1 and OpenAI's o4-mini model. A dataset of medical questions was used for the evaluation. Results: Custom RAG models outperformed commercial models in both accuracy and energy consumption. The RAG model built on llama3.1:8B achieved the highest accuracy (58.5%) and was significantly better than the other models, including o4-mini and DeepSeekV3-R1. The llama3.1-RAG also exhibited the lowest energy consumption and CO2 footprint among all models, with a Performance per kWh of 0.52 and total CO2 emissions of 473g. Compared to o4-mini, the llama3.1-RAG achieved 2.7 times more accuracy points per kWh and 172% less electricity usage while maintaining higher accuracy. Conclusion: Our study demonstrates that local LLMs can be leveraged to develop RAGs that outperform commercial, online LLMs in medical tasks while having a smaller environmental impact. Our modular framework promotes sustainable AI development, reducing electricity usage and aligning with the UN's Sustainable Development Goals.


DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

arXiv.org Artificial Intelligence

Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) have demonstrated impressive language/vision reasoning abilities, igniting the recent trend of building agents for targeted applications such as shopping assistants or AI software engineers. Recently, many data science benchmarks have been proposed to investigate their performance in the data science domain. However, existing data science benchmarks still fall short when compared to real-world data science applications due to their simplified settings. To bridge this gap, we introduce DSBench, a comprehensive benchmark designed to evaluate data science agents with realistic tasks. This benchmark includes 466 data analysis tasks and 74 data modeling tasks, sourced from ModelOff and Kaggle competitions. DSBench offers a realistic setting by encompassing long contexts, multimodal task backgrounds, reasoning with large data files and multi-table structures, and performing end-to-end data modeling tasks. Our evaluation of state-of-the-art LLMs, LVLMs, and agents shows that they struggle with most tasks, with the best agent solving only 34.12% of data analysis tasks and achieving a 34.74% Relative Performance Gap (RPG). These findings underscore the need for further advancements in developing more practical, intelligent, and autonomous data science agents.


Time & Seasonality Features in Time Series

#artificialintelligence

Time and seasonality features are often taken as given in time series analysis, ignoring their crucial role as an input in model calibration. More recently, cyclic seasonality variables have become highly popular in data science. Finding the optimal form of seasonality effects, however, should be part of the model-building process and not simply taken for granted. Cyclic seasonality features built through trigonometric sine and cosine transformations have long been used in biomedical and ecology research. The intuition behind them is that time is cyclical. A plot of the trigonometric transformation of hours is shown below.
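As a rough illustration of the sine and cosine transformation described above (not the article's own code or plot), the following sketch encodes the hour of day on the unit circle. The function name `encode_hour`, the helper `circ_dist`, and the 24-hour period are assumptions made for this example.

```python
import numpy as np

def encode_hour(hours, period=24):
    """Map hour-of-day values onto the unit circle via sine and cosine."""
    angle = 2 * np.pi * np.asarray(hours, dtype=float) / period
    return np.sin(angle), np.cos(angle)

# Encode every hour of a day (assumed 24-hour cycle).
hours = np.arange(24)
hour_sin, hour_cos = encode_hour(hours)

def circ_dist(a, b):
    """Euclidean distance between two hours in the (sin, cos) plane."""
    return float(np.hypot(hour_sin[a] - hour_sin[b], hour_cos[a] - hour_cos[b]))

# In the encoded space, 23:00 is as close to 00:00 as 00:00 is to 01:00,
# whereas the raw integers 23 and 0 are 23 units apart.
print(circ_dist(23, 0), circ_dist(0, 1))
```

Using both components matters: sine alone cannot distinguish, for example, 03:00 from 09:00, while the (sine, cosine) pair identifies each hour uniquely.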


Deep understanding of the ARIMA model

#artificialintelligence

It is worth noting that the observed data is uniquely ordered according to the time of observation, but it doesn't have to be dependent on time, i.e. time (the index of the observations) doesn't have to be one of the independent variables. Stationarity: a stationary process is a stochastic process whose mean, variance and autocorrelation structure do not change over time. It can also be defined formally in mathematical terms, but in this article that is not necessary. Intuitively, if a time series is stationary, any parts of it that we look at should be very similar: the time series is flat-looking and its shape doesn't depend on the shift in time. (... It surely isn't: since it's not stochastic, stationarity is not one of its properties.) Figure 1.1 shows the simplest example of a stationary process, white noise.
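The intuition that different parts of a stationary series look alike can be checked numerically on white noise. This is a minimal sketch, assuming Gaussian white noise with zero mean and unit variance; the seed, the sample size, and the helper `lag1_autocorr` are choices made for this example, not taken from the article.

```python
import numpy as np

# Simulate Gaussian white noise (assumed: mean 0, standard deviation 1).
rng = np.random.default_rng(0)
noise = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Split the series into two halves and compare their summary statistics.
first, second = noise[:5_000], noise[5_000:]
print("means:    ", first.mean(), second.mean())
print("variances:", first.var(), second.var())

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# For white noise the lag-1 autocorrelation should be close to zero
# in both halves, i.e. the autocorrelation structure does not shift either.
print("lag-1 autocorrelation:", lag1_autocorr(first), lag1_autocorr(second))
```

Similar statistics in both halves are consistent with stationarity, although such a comparison is only an informal check and not a formal test.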


A Relaxing Weekend with Samsung's AI-enabled and Smart Appliances

#artificialintelligence

It sounds like something out of a fairytale: when you head out for the day, your fairy godmother zips around the house, finishing all the housework so you don't have to. Okay, okay (earmuffs, kids!), fairy godmothers aren't exactly real, but with Samsung's AI-enabled and smart home appliances, you'd never know the difference. Samsung home appliances' seamless connections are all part of the company's "Intelligence of Things" vision, which envisions a world in which the devices around us are constantly communicating with one another in ways that simplify our day. To demonstrate how the devices take the work out of housework, we present a typical Sunday in one of these connected homes. You've got a lot to do today, so you wake up early, kick off the covers, and start to map out your day.


Detection of Malfunctioning Smart Electricity Meter

arXiv.org Machine Learning

In this paper, we propose a method for detecting malfunctioning smart meters based on Long Short-Term Memory (LSTM) and a Temporal Phase Convolutional Neural Network (TPCNN). The method is particularly useful for developing countries where smart meters are not yet widespread but are in high demand. In addition, extending the service life of smart meters by detecting malfunctioning ones, and thereby preventing unnecessary waste, is a new topic. We are the first to combine malfunctioning meter detection with a prediction model based on deep learning methods. To the best of our knowledge, our approach is the first to detect malfunctioning meters in specific residential areas using their residents' data in practice. The proposed procedure consists of four main components: data collection and cleaning, prediction of electricity consumption based on LSTM, sliding window detection, and single-user classification based on a CNN. To better classify users with malfunctioning meters, we use recurrence plots as image input and combine them with sequence input; this is the first work to feed one-dimensional and two-dimensional representations into a two-path CNN for sequence data classification. Finally, we compare the proposed method with classical methods, including Elastic Net and Gradient Boosting Regression, and the results show that our method achieves higher accuracy. The average area under the Receiver Operating Characteristic (ROC) curve is 0.80 with a standard deviation of 0.04. The average area under the Precision-Recall Curve (PRC) is 0.84.
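To make the idea of a recurrence plot as image input more concrete, here is a minimal sketch of one common construction: a binary matrix marking which pairs of time steps have similar values. The abstract does not specify the paper's exact construction, so the thresholded absolute difference, the function name `recurrence_plot`, the threshold `eps`, and the sample readings are all assumptions made for illustration.

```python
import numpy as np

def recurrence_plot(series, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when |x_i - x_j| <= eps."""
    x = np.asarray(series, dtype=float)
    # Pairwise absolute differences between all time steps via broadcasting.
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= eps).astype(np.uint8)

# Hypothetical meter readings: values near 1.0 and near 3.0 recur over time.
readings = np.array([1.0, 1.1, 3.0, 1.05, 2.9])
rp = recurrence_plot(readings, eps=0.2)
print(rp)
```

The resulting matrix is symmetric with ones on the diagonal and can be treated as a single-channel image, so a two-dimensional convolutional path could consume it alongside the raw one-dimensional sequence.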


Data Analytic Policy Design Applied to Energy Conservation in College Dormitories

AAAI Conferences

We study the design of data analytic policies in a campus dormitory where smart meters are installed to gather usage data. Given the availability of such data, we consider policies that give daily feedback on comparative usage levels and set price incentives accordingly. This requires us to divide users into groups according to their behaviors and to set reasonable prices. Instead of grouping users and setting prices based on intuition and guesses, which may be ineffective and unfair, we propose a data analytic approach. The design starts from a clear set of principles; based on these and the collected data, the user grouping and corresponding pricing are determined automatically so that they satisfy the agreed-to principles. We show how this design approach works in a real setting, with real-world usage data. We also discuss the difficulties of introducing such policies, which are more complicated and involve some uncertainties, and a possible solution: using opt-in (or opt-out) when such new policies are first introduced. We expect the data analytic policy approach and our experience to be applicable and useful in general settings.